Complexity control in a mixture model by Hardy-Weinberg equilibrium

Authors

  • Ella Bingham
  • Heikki Mannila
Abstract

A method of complexity control in multinomial mixture modeling of multiple-marker genotype data, imposing the Hardy-Weinberg equilibrium (HWE) between the genotype values, is studied. This is a very natural restriction and is known to hold at the population level under modest assumptions. The hypothesis under study is that imposing this restriction will prevent overfitting and lead to a better model. This is shown to indeed be the case. Experimental results on chromosomes 1 and 17 of the HapMap data demonstrate that the restricted model generalizes better to unseen data and also finds clusters that correspond better to the ethnic groups of the HapMap, when compared to a model without the HWE restriction.
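
To make the restriction concrete, the following is a minimal sketch of what HWE implies for a single marker, assuming a biallelic marker with three genotype values (AA, Aa, aa). The function names are hypothetical and the abstract does not give the authors' actual parametrization; the sketch only illustrates why the restriction reduces model complexity. An unrestricted multinomial over three genotype values has two free parameters per marker and mixture component, while under HWE a single allele frequency determines all three probabilities.

```python
def hwe_genotype_probs(p):
    """Genotype probabilities (AA, Aa, aa) under Hardy-Weinberg equilibrium.

    p is the frequency of allele A at a biallelic marker (an assumption of
    this sketch; the abstract does not state the marker type).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    # HWE: genotype frequencies are p^2, 2pq and q^2.
    return (p * p, 2.0 * p * q, q * q)


def free_params_per_marker(hwe):
    """Free parameters per marker and mixture component (illustrative only).

    Unrestricted multinomial over 3 genotypes: 2 free probabilities.
    HWE-restricted: 1 free allele frequency.
    """
    return 1 if hwe else 2


probs = hwe_genotype_probs(0.3)
print(probs, sum(probs))
print(free_params_per_marker(hwe=False), free_params_per_marker(hwe=True))
```

Halving the number of free parameters per marker in this way is one plausible reading of how the restriction acts as complexity control.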

Similar resources

Complexity control in a mixture model by the Hardy-Weinberg equilibrium

A method of complexity control in multinomial mixture modeling of multiple-marker genotype data, imposing the Hardy–Weinberg equilibrium (HWE) between the genotype values, is studied. This is a very natural restriction, and known to hold at population level under modest assumptions. The hypothesis under study is that imposing this restriction will prevent overfitting and lead to a better model....

Hardy Weinberg Equilibrium Testing and Interpretation: Focus on infection

Hardy-Weinberg equilibrium (HWE) holds when, in a closed population with random mating and without mutation and natural selection, genotype frequencies at any locus are a simple function of allele frequencies. Testing for HWE is now a common practice in population genetics and genetic association studies of non-communicable diseases; however, it is less regarded, or sometimes misinterpreted, i...

Association of Polymorphism at 3′-UTR of Urokinase Gene with Risk of Calcium Kidney Stones

Urokinase might play a role in the formation of kidney stones. This study was done to determine the association between the +4065 T/C polymorphism at the 3′-untranslated region of the urokinase gene and calcium kidney stones. This case-control study was carried out on 70 cases with a history of calcium kidney stones and 70 controls from the Baqiyatallah Hospital in Tehran in 2013. The study of polymorp...

Genetic diversity of the rs438601 marker in the Isfahan population: an informative marker for the molecular diagnosis of hemophilia B

Introduction: Hemophilia B is an X-linked recessive genetic disease caused by mutations in the coagulation Factor IX gene. Mutations in the Factor IX gene result in dysfunction or deficiency of coagulation Factor IX. Direct mutation analysis is the ideal method for molecular diagnosis of the disease. However, due to the high number of identified mutations in the gene, the lack of a comm...

Distributions of Hardy-Weinberg equilibrium test statistics.

It is well established that test statistics and P-values derived from discrete data, such as genetic markers, are also discrete. In most genetic applications, the null distribution for a discrete test statistic is approximated with a continuous distribution, but this approximation may not be reasonable. In some cases using the continuous approximation for the expected null distribution may caus...

Publication date: 2008